Note: This site is a bit older, personal views may have changed.

Where The Hell Is The Connect Call


You're looking through the code of a project that is growing large, with lots of traffic (imagine that, performance actually matters, ha), and you can't find where the program connects to the database. Pretend you are the next Google.

You are trying to make the search engine faster by opening and closing database connections less often. Lazy loading (only connecting when absolutely necessary) and caching content (pre-saving content as HTML snippets) come to mind.

(Actually, it would be nice if databases had connection pooling and more caching abilities built into them, so the programmer doesn't have to worry about building their own caches and pooling systems.)

So anyway, you are looking through the code and you notice this abstracted object oriented shit and you can't find the darn DbConnect() call. Who the hell knows where it is. It's mapped up in some Create() method 6 or 60 levels up from the code you know and control.

Then you say, mmmm.. let's just start over from scratch here. We'll get this sucker done properly if we just rewrite it in a way that's more maintainable. We'll manage the connection ourselves so we can lazy load and do caching instead of connecting on most requests.

So you do this:

  ShowHeader;
  if not caching then 
    DbConnect                          <------- WOW! What's that I see?
  else 
    DoCaching;
  ShowContent;
  ShowFooter;
  if not caching then DbClose;         <------- OH? What's this here?

Ahh, and where the hell is the DbConnect() call? Oh, it's a procedure that gets called in a logical procedural order that I can follow. A few procedures happen one after the other, and now we know where the fuck the connection is taking place.
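
Here is roughly the same flow as a runnable sketch, in Python. Everything in it is made up for illustration: `page_cache`, `stats`, `db_connect` and `render_page` are hypothetical names, and an in-memory SQLite database stands in for whatever real server you would be connecting to.

```python
import sqlite3

# Hypothetical in-memory cache: page name -> pre-rendered content snippet.
page_cache = {}
# Counts how often we really open a connection (for illustration only).
stats = {"connects": 0}


def db_connect():
    """The one visible place where the connection is opened."""
    stats["connects"] += 1
    conn = sqlite3.connect(":memory:")  # stand-in for a real database server
    conn.execute("CREATE TABLE pages (name TEXT PRIMARY KEY, body TEXT)")
    conn.execute("INSERT INTO pages VALUES ('home', 'Hello from the database')")
    return conn


def render_page(name):
    parts = ["<header>"]                      # ShowHeader
    if name in page_cache:
        parts.append(page_cache[name])        # cache hit: no connection at all
    else:
        conn = db_connect()                   # <--- the connect call, in plain sight
        try:
            row = conn.execute(
                "SELECT body FROM pages WHERE name = ?", (name,)
            ).fetchone()
        finally:
            conn.close()                      # <--- and the close call right next to it
        body = row[0] if row else "Not found"
        page_cache[name] = body               # save it for the next request
        parts.append(body)
    parts.append("<footer>")                  # ShowFooter
    return "".join(parts)
```

Call `render_page("home")` twice and only the first call should touch the database. The point is not the cache itself, it's that the connect and close calls sit in the same few lines as the cache check, where you can find them and move them.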

(What would be nice is a database that you don't have to connect to at all for common requests, so that none of these connect issues would arise, i.e. some sort of cached database or database file system that doesn't require authentication or connections. Maybe Google File System or some other Google technology has this for their search engine, because I doubt Google has time for connections and authentications or even pooling. Why authenticate if you are accessing fully readable data (no logins or passwords, just public data, for example)? When you grab a file from the drive you do not authenticate each time you need to read the file!)

Now we could even add some tricky connection pooling if our caching and lazy loading don't work wonders for us.. but the point is that programs which are MAINTAINABLE sometimes need to be written in a logical, procedural, orderly manner where you see and know what is going on.
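
If it ever does come to pooling, the idea is small enough to sketch. This is a toy, assuming a single thread and SQLite as a stand-in for a real server; `TinyPool`, `acquire` and `release` are invented names, not any real library's API.

```python
import queue
import sqlite3


class TinyPool:
    """Toy connection pool: hand back idle connections instead of reconnecting."""

    def __init__(self, size, path=":memory:"):
        self.opened = 0              # how many real connections were opened
        self._size = size            # max number of idle connections to keep
        self._path = path
        self._idle = queue.Queue()

    def acquire(self):
        try:
            return self._idle.get_nowait()      # reuse an idle connection
        except queue.Empty:
            self.opened += 1
            return sqlite3.connect(self._path)  # the connect call, still easy to find

    def release(self, conn):
        if self._idle.qsize() < self._size:
            self._idle.put(conn)                # keep it around for the next caller
        else:
            conn.close()                        # pool is full, really close it
```

Even here the connect call lives in exactly one method, so you can still point at it within 10 seconds.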

Hidden WordPress Connect

WordPress is popular blogging software. Let's take it as a typical real world example. If you look through the WordPress code, the connect call is hidden away in some include file in some object creation. And WordPress connects each time a page is loaded! Now imagine how often a blog changes.. not that often. So why connect to the DB on each and every request?

We have this code executed:

  $wpdb = new wpdb(DB_USER, DB_PASSWORD, DB_NAME, DB_HOST);

What's that do?

	// ==================================================================
	//	DB Constructor - connects to the server and selects a database

	function wpdb($dbuser, $dbpassword, $dbname, $dbhost) {
		$this->dbh = @mysql_connect($dbhost, $dbuser, $dbpassword);
		if (!$this->dbh) {
			$this->bail("... content snipped ... ");
		}

		$this->select($dbname);
	}

Every time an object is created, you create a connection to the database. But no worry, right? I mean.. hiding the connect call in the constructor allows us to encapsulate and reuse and all that. Huff puff.
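
For contrast, here is a minimal sketch of an object that does not connect when it is created. It is written in Python rather than PHP and has nothing to do with the actual WordPress code; `LazyDb` and its methods are hypothetical, and SQLite stands in for MySQL.

```python
import sqlite3


class LazyDb:
    """Hypothetical wrapper: creating the object does NOT open a connection."""

    def __init__(self, path):
        self.path = path
        self.conn = None           # nothing is opened yet
        self.connect_count = 0     # for illustration: how many real connects

    def connect(self):
        # The only place a connection is ever opened.
        if self.conn is None:
            self.conn = sqlite3.connect(self.path)
            self.connect_count += 1
        return self.conn

    def query(self, sql, params=()):
        # Connect on first use, then keep reusing the same connection.
        return self.connect().execute(sql, params).fetchall()

    def close(self):
        if self.conn is not None:
            self.conn.close()
            self.conn = None
```

A page served entirely from cache can create a `LazyDb` and never pay for a connection, because nothing connects until the first `query`. You keep the object if you like objects, and the connect call is still one named method you can find.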

Maybe WordPress will improve over time and this criticism will be outdated then. Who knows. The good news with WordPress is that you can actually find the Connect() call in the code! It's still there at least.. But with other frameworks the connect calls are even more hidden away.

Most servers couldn't care less about a few extra connect calls. Most people build home pages that receive small amounts of traffic. But on busy shared environments it can make a difference (say 300 blogs are installed on a single server).

Hide Code To Reuse It?

Hiding things may be dangerous. Abstraction can be good, but when it comes to scalability.. where the fuck is the connect() call? Which abstracted object calls the connect in some other framework that I didn't write and whose workings I don't have a clue about, since the whole point of the framework is to be 'magical' where everything happens 'magically'?

Reuse is good - but can I modify the code? Can I modify the magic?

Business Objects!

Alright, none of this above happened to me word for word. Most people spend so much money on hardware that performance never matters and the connection calls don't really have that much effect on the web app. But if it did need to be optimized, I'd like to be able to locate the damn code to optimize.. and a database connection is something I would optimize (reduce connection calls wherever possible). So please don't lock me into some mapped shit where connects happen whenever I create an object or something ridiculous like that. I want to be able to cut that Connect call like it is a piece of cake, and then put my own icing around it if I need to. No hiding it from me. I do custom solutions, thanks.

And yes, you could implement caching into an abstracted framework too by contacting the framework authors and begging them, since you can't understand the framework code yourself.

My point is: where the fuck is the connect call and can I gain access to it easily myself if I need to refine and modify the code? Can I get to it?

If not, how are you going to scale? How are you going to improve the app so it doesn't connect on every instantiation of the WhateverFuckingObject you are trying to use and abuse?

What custom solutions can you offer when we have a performance bottleneck? Are you gonna contact the boys who wrote the framework that does all the database stuff for you - or did the framework boys leave it up to you, Dick? Can you gain access to the connect procedure and delay connecting by caching first? Or is your connect() call shoved so far up your abstracted asshole that you can't even gain access to it or modify the code near it safely any more?

10 Second Rule

If you can't find the CONNECT call within 10 seconds of me asking, and if you can't modify the code around the connect call to suit your needs, then you're FUCKED. It's kind of like trying to find the STDOUT function in your app. If it is hidden away somewhere or if it is shoved and tucked away somewhere that makes it really hard to get at, then you are FUCKED. No matter what perfect by-the-book framework you are using.

You'd better be able to locate the connect call and you'd better be able to move it around and optimize it (unless your framework is so perfect that you never have to do any customization, which is a RARE case).

Story Line

Consider something like this:
  VISITOR: 
    Today I GetUpOutOfBed()
    Then I OpenTheFridge()

  VISITOR: 
    But the Fridge Opener Instantiation On the Create call Doesn't Work and the 
    fridge app says "out of fridge door open swings today, sorry, can't handle this 
    many opens". I'm gonna email the fridgemaster and tell him his fridge site is 
    broken!

  PROGRAMMER 1:
     (programmer1 is a newbie, just got out of university, book smart)

     I got your email VISITOR. Thanks.  We're working on fixing the error you got.
     We've had a lot of visitors to our fridge and sometimes the door won't open 
     because we have too many people trying to open it.

     (Now programmer starts debugging and talking to himself. Another programmer 
     walks in the room.)

     Oh, hi programmer2, we have a problem. Visitors are telling us the Fridge is 
     giving them errors when they visit the site! It's connection issues.

     The ORM thing does all the connection stuff for me. When I create
     the object it all happens magically somewhere. Programmer2, how are we going
     to fix and scale this fridgeapp with all these open fridge door issues?

  PROGRAMMER 2:
     (programmer2 is an experienced programmer)

     What ORM object? How many objects and where does it really connect? 

     Look, just tell me where is the OpenFridge procedure exactly?

  PROGRAMMER 1 (boob):
     Huh? procedure? There is no procedure, we are doing object oriented code.

  PROGRAMMER 2:
     Yeah, but, um, where is the connection occurring?

  PROGRAMMER 1 (boob):
     I dunno, uhm, somewhere.. in the ORM thing. I dunno who wrote it, but it's cool.
     It saves us time and it does this automatic mapping for us and there's no worry 
     about connections or SQL garbage (Shitty Query Language) or any of that 
     nonsense. 

     And this dude that writes about Software Patterns.. he wrote this book about how
     encapsulation is the best and that we need to reuse code!

  PROGRAMMER 2:
     Uhm, and where did you say the OpenFridge() procedure was again? Because 
     our website is broken and we need to scale. We got too many people opening the 
     fridge at once. We need to find the OpenFridge code and add caching and 
     connection avoidance code. In order to do this we need to find where 
     OpenFridge() is called and we need to add the code around it to enable caching 
     based on some 'if logic' so that the connections aren't opened when we don't 
     need them. 

  PROGRAMMER 1 (boob):
     Uhm.. I dunno, the ORM does it all for us and they said that we shouldn't worry 
     about database connections or things like that.. more importantly we instantiate 
     the business object and it becomes our business instantly!

     And the connect call? No such thing! Got nothin' ta do with business.

  PROGRAMMER 2:
     I'm rewriting it all. Fuck the frameworks you've been using. I'll take care of 
     the content caching and we'll add pooling if we need to but lazy loading and 
     caching should be good. I'll put the connect() calls somewhere that we can 
     maintain in logical order near the cache check code. You can do customer service
     and become the project manager.

  PROGRAMMER 1 (boob):
     Are you going to add this code into the frameworks we are using now?
     We're going to reuse all the code and frameworks! Code reuse!

  PROGRAMMER 2:   
     Uhm, yeah, sure, sure, some code we might make use of.. look.. you just worry 
     about the customers, I'll handle the code. M'kay?

  PROGRAMMER 1 (boob):
     Okay! Once we have the site handle more traffic we'll make big money!
     Then I can work for Microsoft or we can sell the business if it works!


No, no, none of this ever happened to me.. I just realized how lucky I am to sometimes be able to go into other people's code and still be able to actually modify it and read it logically! How wonderful and reusable, and modifiable, and scalable. Other (most?) times, I'm fucked though.
